Abstract

In evaluating an algorithm, worst-case analysis can be overly pessimistic.
Average-case analysis can be overly optimistic. An intermediate approach is
to show that an algorithm does well on a broad class of input distributions.
Koutsoupias and Papadimitriou [?] recently analyzed the least-recently-used
(Lru) paging strategy in this manner, analyzing its performance on an input
sequence generated by a so-called diffuse adversary: one that must choose
each request probabilistically so that no page is chosen with probability
more than some fixed ε > 0. They showed that Lru achieves the optimal
competitive ratio (for deterministic on-line algorithms), but they didn't
determine the actual ratio.

In this paper we estimate the optimal ratios within roughly a factor of two
for both deterministic strategies (e.g. least-recently-used and
first-in-first-out) and randomized strategies. Around the threshold ε ≈ 1/k
(where k is the cache size), the optimal ratios are both Θ(ln k). Below the
threshold the ratios tend rapidly to Θ(1). Above the threshold the ratio is
unchanged for randomized strategies but tends rapidly to Θ(k) for
deterministic ones.

We also give an alternate proof of the optimality of Lru. 



1 Introduction and Background 

The paging problem was originally studied in the context of two-level virtual 
memory systems composed of a large, slow-access memory augmented with a 
cache (a small, fast-access memory, holding likely-to-be accessed pages in order 
to minimize access time). 

This paper concerns the following standard abstraction of this simple and
common problem. The input is an integer k and a finite sequence
s = s_1 s_2 ... s_n of requests. The parameter k is called the cache size.
The output is a schedule: a sequence S_1 S_2 ... S_n of sets, where each set
is of size at most k, and each S_t contains s_t. Each request s_t is said to
occur at time t. The items in S_t are said to be in the cache after time t
up to and including time t + 1. An item is said to be evicted at time t if
the item is in S_{t-1} but not in S_t. The cost of the schedule is the
number of evictions. A schedule for an input is optimal if it achieves the
minimum possible cost.

Next we define the paging algorithms considered in this paper. Each evicts
pages only when the cache is full and does not contain the requested item.
Least-recently-used (Lru) evicts the item whose most recent request is the
least recent among all items in the cache. First-in-first-out (Fifo) evicts
the item that has been in the cache the longest. Flush-when-full (Fwf)
evicts all items in the cache. The randomized marking algorithm (RMark [?])
operates as follows. After an item is requested, it is marked. When an item
must be evicted, a non-marked item is chosen uniformly at random, with the
caveat that if all items in the cache are marked, then all marks are first
erased. By a deterministic marking algorithm, we mean any deterministic
algorithm that maintains marks as RMark does, and evicts only unmarked
items. Lru, Fifo, and Fwf are examples. By a lazy deterministic marking
algorithm (DMark), we mean a deterministic marking algorithm that evicts an
item only when necessary, and then only one item. This additional
requirement excludes Fwf.
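The three deterministic rules just defined can be sketched in a few lines of
Python. This is a minimal illustration, not code from the paper: the
function names are made up here, the cache is assumed to start empty, and
the cost counted is the number of evictions, as in the abstraction above.

```python
from collections import OrderedDict, deque

def lru_evictions(requests, k):
    """Count evictions made by least-recently-used with cache size k."""
    cache = OrderedDict()          # keys ordered from least to most recently used
    evictions = 0
    for x in requests:
        if x in cache:
            cache.move_to_end(x)   # x becomes the most recently used item
            continue
        if len(cache) == k:        # evict only when full and x is missing
            cache.popitem(last=False)
            evictions += 1
        cache[x] = True
    return evictions

def fifo_evictions(requests, k):
    """Count evictions made by first-in-first-out with cache size k."""
    cache, order, evictions = set(), deque(), 0
    for x in requests:
        if x in cache:
            continue
        if len(cache) == k:
            cache.remove(order.popleft())   # item that entered the cache earliest
            evictions += 1
        cache.add(x)
        order.append(x)
    return evictions

def fwf_evictions(requests, k):
    """Count evictions made by flush-when-full with cache size k."""
    cache, evictions = set(), 0
    for x in requests:
        if x in cache:
            continue
        if len(cache) == k:
            evictions += len(cache)         # flush: every cached item is evicted
            cache.clear()
        cache.add(x)
    return evictions
```

On the sequence 1, 2, 3, 1, 4, 2 with k = 3, the three rules already differ:
Fifo keeps item 2 and so avoids the second eviction that Lru incurs, while
Fwf pays for the whole cache at once.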

An algorithm for the problem is on-line if, for any request sequence and any 
request in that sequence, the items in the cache after the request are independent 
of later requests. In many contexts, on-line algorithms are necessary, but on- 
line algorithms are necessarily sub-optimal on some request sequences. Hence, 
a natural question is how on-line algorithms can be effectively analyzed and 
compared. 

This paper is concerned with a generalization of the standard competitive
analysis [?] of on-line algorithms. The standard model measures the quality
of an algorithm A by its competitive ratio: the minimum (to be precise,
infimum) c such that, for some constant b, for all request sequences s,

\[ A(s) \le c \cdot \mathrm{Opt}(s) + b. \]

Here A(s) denotes the cost of the schedule produced by A on input s; Opt(s)
denotes the cost of an optimal schedule. If A is a randomized algorithm,
then A(s) denotes the expected cost of A on input s. Note that k is an
implicit, and fixed, parameter in these definitions. Standard competitive
analysis is a worst-case type of analysis, in contrast to much of the
earlier work on paging, which is concerned with average-case analysis.¹

In the standard competitive-analysis framework the following results are
known. Any deterministic marking algorithm, including Lru, Fifo, and Fwf,
has a competitive ratio of k; the ratio k is the best possible for any
deterministic on-line strategy [?]. The randomized marking algorithm RMark
has a competitive ratio of 2H(k) − 1 [?], where
H(k) = Σ_{i=1}^{k} 1/i ≈ ln(1 + k). Partition [?] and Equitable [?], more
complicated randomized algorithms, each have competitive ratio H(k). No
randomized strategy can have a better ratio than H(k) [?].

¹ At least one work [?] preceding competitive analysis blends average-case
and worst-case analysis. It considers input sequences where each request is
chosen from a fixed but unknown distribution on the pages, and compares
known paging strategies to the optimal on-line strategy for that
distribution.
Largely due to the unrealistic magnitude of the optimal competitive ratios
[?], many variations on the standard model have been considered (e.g. [?]).
For a survey on competitive analysis of paging, we refer the reader to the
recent book by Borodin and El-Yaniv [?, chapters 3-5].

This paper concerns the following generalization of the standard model,
recently proposed by Koutsoupias and Papadimitriou [?]. For any class Δ of
distributions on the input sequences and any deterministic or randomized
algorithm A, define R(Δ, A), the competitive ratio of A against the
Δ-diffuse adversary, to be the minimum (again, to be precise, infimum) c
such that for each distribution D in Δ, there is a constant b such that

\[ E_D[A(r)] \le c \cdot E_D[\mathrm{Opt}(r)] + b. \]

Here r is a random sequence chosen according to D. Define the optimal ratio
for deterministic on-line algorithms (against the Δ-diffuse adversary) to be

\[ \mathcal{R}(\Delta) = \inf_{A} \mathcal{R}(\Delta, A), \]

where A ranges over all deterministic on-line algorithms. Analogously,
define the optimal ratio for randomized on-line algorithms (against the
Δ-diffuse adversary) to be

\[ \mathcal{R}_R(\Delta) = \inf_{A_R} \mathcal{R}(\Delta, A_R), \]

where A_R ranges over all randomized on-line algorithms.

The particular class of distributions considered by Koutsoupias and
Papadimitriou is denoted Δ_ε and is defined as follows. Any distribution D
specifies, for each item x and sequence of requests s, the probability
Pr_D(x | s) that the next request of the random sequence r is x given that
the sequence so far is s. Then Δ_ε contains those distributions D such that,
for any request sequence s and item x, Pr_D(x | s) ≤ ε. The parameter ε is a
measure of the inherent uncertainty of each request. Koutsoupias and
Papadimitriou show that Lru achieves the optimal ratio in this model (i.e.
R(Δ_ε, Lru) = R(Δ_ε)), but they leave open the question of what the ratio
is.

Here we estimate the optimal ratios within roughly a factor of two, for both 
deterministic and randomized algorithms. Here is our main theorem. 

Theorem 1 Define

\[ \Phi(\epsilon, k) = 1 + \sum_{i=1}^{k-1} \frac{1}{\max\{\epsilon^{-1} - i,\ 1\}}. \]

For any ε, let ε' = 1/⌈ε^{-1}⌉. The competitive ratios of deterministic (R)
and randomized (R_R) on-line algorithms against the Δ_ε-diffuse adversary
are bounded as follows:

                 deterministic: R(Δ_ε)        randomized: R_R(Δ_ε)
range            lower bound   upper bound    lower bound    upper bound
ε ≤ 1/(k+1)      Φ(ε,k) − 1    2Φ(ε,k)        Φ(ε',k) − 1    2Φ(ε,k)
ε ≥ 1/(k+1)      Φ(ε,k)        2Φ(ε,k)        H(k)           H(k)

The upper bound 2Φ for deterministic algorithms holds for any lazy marking
algorithm (e.g. Lru, Fifo) but not for Fwf. The upper bound H(k) for
randomized algorithms holds for Partition and Equitable. The weaker upper
bound 2H(k) − 1 holds for RMark.

In all cases except one, the competitive ratios of (lazy) deterministic and
randomized marking algorithms are at least Φ − 1 and at most 2Φ. The
exception is that for ε above the threshold 1/(k + 1), the randomized ratio
is H(k) (independently of ε). To understand the behavior of the function Φ,
consider the case ε = 1/n for some integer n. Then

\[
\Phi(1/n, k) = 1 + H(n-1) +
\begin{cases}
-H(n-k) & \text{when } n > k, \\
k - n & \text{when } n \le k.
\end{cases}
\]

Recall that H(k) = Σ_{i=1}^{k} 1/i ≈ ln(k + 1). The threshold of Φ around
ε ≈ 1/k is very sharp:

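\[
\Phi(\epsilon, k) \text{ is }
\begin{cases}
\le 1 + \ln(1/\delta) & \text{when } \epsilon = (1-\delta)/k, \\
\approx \ln k & \text{when } \epsilon = 1/k, \\
\ge k\delta/(1+\delta) & \text{when } \epsilon = (1+\delta)/k.
\end{cases}
\]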
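The definition of Φ and its closed form at ε = 1/n are easy to check
numerically. The sketch below is an illustration only: the function names
are chosen here, and exact rational arithmetic is used so that the two
expressions can be compared for equality rather than up to rounding.

```python
from fractions import Fraction

def harmonic(n):
    """H(n) = 1 + 1/2 + ... + 1/n as an exact fraction (H(0) = 0)."""
    return sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))

def phi(eps, k):
    """Phi(eps, k) = 1 + sum_{i=1}^{k-1} 1 / max(1/eps - i, 1)."""
    inv = 1 / Fraction(eps)
    terms = (1 / max(inv - i, Fraction(1)) for i in range(1, k))
    return 1 + sum(terms, Fraction(0))

def phi_closed_form(n, k):
    """Closed form of Phi(1/n, k) for an integer n >= 1."""
    if n > k:
        return 1 + harmonic(n - 1) - harmonic(n - k)
    return 1 + harmonic(n - 1) + k - n
```

For example, phi(Fraction(1, k + 1), k) equals harmonic(k), which is why the
deterministic and randomized bounds meet at the threshold ε = 1/(k + 1).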



2 Technical Overview 

We refine an existing worst-case competitive analysis for paging [?] to take
into account the probabilistic restrictions on the adversary. We call this
particular analysis the factor-two analysis because for our purposes (and
when used to analyze the randomized marking algorithm [?]) it (at best) can
approximate Opt only within a factor of two.



2.1 Review of Factor-Two Analysis in the Standard Model

Let A be any paging algorithm and let s = s_1 s_2 ... s_n be any sequence of
requests. The phases of s partition the times {1, 2, ..., n} into intervals
as follows. Define t(1) = 1. For ℓ ∈ N inductively define

\[ t(\ell+1) = 1 + \max\{\, j \le n : |\{s_{t(\ell)}, s_{t(\ell)+1}, s_{t(\ell)+2}, \ldots, s_j\}| \le k \,\}. \]

For each ℓ ∈ N such that t(ℓ) ≤ n, the ℓth phase of s is defined to be the
time interval {t(ℓ), t(ℓ) + 1, ..., t(ℓ + 1) − 1}. Thus, during each phase
except the last, k distinct items are requested.
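The phase partition can be computed in one left-to-right pass: a new phase
begins exactly when a request would be the (k + 1)st distinct item of the
current phase. A minimal sketch follows; the function name is hypothetical,
and indices are 0-based, whereas the text numbers times from 1.

```python
def phase_starts(requests, k):
    """Return the 0-based start index of each phase of the request sequence."""
    starts, seen = [], set()
    for t, x in enumerate(requests):
        # Start a phase at the first request, or when x would be the
        # (k+1)st distinct item requested in the current phase.
        if not starts or (x not in seen and len(seen) == k):
            starts.append(t)
            seen = set()
        seen.add(x)
    return starts
```

For the sequence a b c a b d a e b with k = 3, the phases are (a b c a b),
(d a e) and (b), so the function returns [0, 5, 8].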
In the context of a particular time t, the current request refers to the
request s_t. This phase or, synonymously, the current phase means the phase
containing the time t. An item is requested previously in this phase if it
is requested during this phase before time t. In the additional context of a
particular schedule S = S_1 S_2 ... S_n for s, the cache refers to the set
S_{t-1} of items in the cache before request t. Then at each time t, each
item is classified with respect to its status before request s_t as follows:

new — not requested previously in this phase or in the last phase. 

old — requested during the last phase, but not previously in this phase. 

redundant — requested previously in this phase. 

worrisome — requested in the last phase or previously in this phase, but not 
in the on-line algorithm's cache. 

Each request is classified as well, according to the status of the requested
item. For instance, a request s_t is new if the requested item was new after
request s_{t-1}. Each phase (except possibly the last) has k non-redundant
requests, each one of which is either new or old. Define

new(s): the total number of new requests in sequence s.

new_in_ph(ℓ): (in the context of some sequence) the total number of new
requests in the ℓth phase of the sequence. Here ℓ is any non-negative
integer. If ℓ = 0 or there is no ℓth phase, define new_in_ph(ℓ) to be 0.

The relevance of the new requests is as follows. 

Lemma 1 ([?]) new(s)/2 ≤ Opt(s) ≤ new(s)



Proof: Consider the (ℓ − 1)st and ℓth phases of s for any ℓ. The number of
distinct items requested in the two phases is k + new_in_ph(ℓ). Thus, the
number of evictions incurred by Opt during the two phases is at least
new_in_ph(ℓ) and

\[
\mathrm{Opt}(s) \;\ge\; \max\Bigl\{ \sum_{\ell\ \mathrm{odd}} \mathrm{new\_in\_ph}(\ell),\ \sum_{\ell\ \mathrm{even}} \mathrm{new\_in\_ph}(\ell) \Bigr\}
\;\ge\; \sum_{\ell} \mathrm{new\_in\_ph}(\ell)/2 \;=\; \mathrm{new}(s)/2.
\]



On the other hand, the following schedule costs at most new(s). At the 
beginning of each phase, evict those items that are not requested during the 
phase and bring in the items that are not in the cache but are requested during 
the phase. After each phase ends, the items requested during that phase are in 
the cache, so the number of evictions in the next phase is just the number of 
new requests in that phase. Thus, the cost of this schedule is new(s). Since
the schedule produced by Opt is at least as good, Opt(s) ≤ new(s). □
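Lemma 1 can be checked empirically on small inputs. The sketch below makes
two assumptions that are not in the text: Opt is computed with the classical
furthest-in-future eviction rule, and the cache starts empty. With an empty
start, every request in the first phase counts as new yet causes no
eviction, so the lower bound is only expected to hold up to an additive k/2;
the upper bound is unaffected. Function names are illustrative.

```python
def count_new(requests, k):
    """Count new requests: items not requested earlier in this phase or in the previous one."""
    prev, cur, new = set(), set(), 0
    for x in requests:
        if x not in cur and len(cur) == k:   # a (k+1)st distinct item starts a new phase
            prev, cur = cur, set()
        if x not in cur and x not in prev:
            new += 1
        cur.add(x)
    return new

def opt_evictions(requests, k):
    """Evictions under the furthest-in-future rule, starting from an empty cache."""
    cache, evictions = set(), 0
    for t, x in enumerate(requests):
        if x in cache:
            continue
        if len(cache) == k:
            def next_use(y):
                # Index of the next request to y after time t, or n if none.
                for u in range(t + 1, len(requests)):
                    if requests[u] == y:
                        return u
                return len(requests)
            cache.remove(max(cache, key=next_use))
            evictions += 1
        cache.add(x)
    return evictions
```

On a b c a b d a e b with k = 3 there are 6 new requests (3 of them in the
first phase) and the furthest-in-future rule evicts twice, which is
consistent with (6 − 3)/2 ≤ 2 ≤ 6.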
By the amortized cost incurred by Opt during a phase, we mean half the
number of new requests in that phase. By the lemma above, the total cost
incurred by Opt is at least the total of these amortized costs and at most
twice the total. To show bounds on the competitive ratio of A, we use the
standard method of bounding the cost incurred by A during a phase divided by
the amortized cost incurred by Opt during the phase. For instance, if this
ratio is at most c for each phase of a sequence s, then it follows
immediately that A(s) ≤ c · Opt(s).

One intuition for understanding RMark and other marking algorithms such as
Lru, Fifo, and even Fwf is that they are emulating the schedule described in
the proof above that Opt(s) ≤ new(s). That is, during each phase, the "goal"
(intuitively speaking) is to get the items that will be requested during the
phase into the cache. From this point of view, once an item is requested
during a phase, it should be kept in the cache. This is the principle that
defines a deterministic marking algorithm.

If this principle is followed, then only non-redundant requests can cause evic- 
tions. Since the phase ends after k non-redundant requests, any deterministic 
marking algorithm incurs a cost of at most k during the phase. This means that 
in the standard model, the competitive ratio is at most k (Opt also incurs at 
least one eviction per phase). Conversely, the adversary can force a ratio of k 
against a deterministic on-line algorithm by making one new request each phase 
and then making k − 1 requests, each to whichever old item is not currently in
the cache. 

2.2 Factor-Two-Analysis for the Diffuse Adversary 

In the standard model, the adversary can force each old request (i.e., each 
non-redundant request to an item requested in the previous phase) to cause an 
eviction. In the diffuse adversary model, this is not so, because the adversary
can only assign probability at most ε to each item. The adversary may have to
assign probability to redundant and/or new items. To adapt the standard analysis
to the diffuse adversary setting, we analyze the extent to which the adversary 
can assign probability to old items. Recall that old items that are not in the 
on-line algorithm's cache are called worrisome, as are requests to such items. 
We analyze the extent to which the adversary can cause worrisome requests. 

We next sketch the argument for the upper bound, glossing over issues of 
probabilistic conditioning, in order to convey the intuition. In the subsequent 
section we give a formally correct treatment. We then give the lower bound; 
the intuition for the lower bound is similar to that of the upper bound. 

Consider the ℓth phase for any ℓ. There are k non-redundant requests in the
phase (except possibly for the last phase, which may have fewer). Consider
the state of any marking algorithm DMark just before the (i + 1)st
non-redundant request, for 1 ≤ i ≤ k − 1.

The i redundant items are marked and in the cache. Of the k items requested
last phase, at most new_in_ph(ℓ) are worrisome (out of the cache). Thus, the
adversary can assign at most ε · new_in_ph(ℓ) probability to worrisome items.
Since there are only i redundant items, the adversary has to assign at least
1 − εi probability to non-redundant items. Therefore, the probability that
the request will be worrisome, given that the request turns out to be
non-redundant, is at most

\[ \frac{\epsilon \cdot \mathrm{new\_in\_ph}(\ell)}{1 - \epsilon i} = \frac{\mathrm{new\_in\_ph}(\ell)}{\epsilon^{-1} - i} \]

(or 1 if this quantity is negative or more than 1). Summing over i, adding
new_in_ph(ℓ) for the evictions due to new requests, and dividing by
new_in_ph(ℓ)/2 (the amortized cost incurred by Opt for the phase) gives the
desired upper bound 2Φ on the competitive ratio.
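The summation in this sketch can be written out as a short calculation. It
is a hedged reconstruction rather than a formula quoted from the text: n_ℓ
abbreviates new_in_ph(ℓ), cost_ℓ is shorthand introduced here for the
expected number of evictions by DMark in phase ℓ, conditioning is glossed
over exactly as in the sketch, and n_ℓ ≥ 1 is assumed (when n_ℓ = 0 no item
is worrisome, so the phase costs nothing).

\[
\mathrm{cost}_\ell \;\le\; n_\ell + \sum_{i=1}^{k-1} \frac{n_\ell}{\max\{1,\ \epsilon^{-1} - i\}}
\;=\; n_\ell \, \Phi(\epsilon, k),
\qquad\text{so}\qquad
\frac{\mathrm{cost}_\ell}{n_\ell / 2} \;\le\; 2\,\Phi(\epsilon, k).
\]

Each term of the sum covers both cases of the per-request bound: if
ε^{-1} − i ≥ 1, the term is the bound n_ℓ/(ε^{-1} − i) itself; otherwise the
probability is at most 1, and 1 ≤ n_ℓ.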

The above upper bound can be turned into a roughly equivalent lower bound. 
The lower bound loses a factor of 2 because of our use of new requests in 
approximating Opt(s). It loses an additional additive term of 1 in some cases; 
we revisit this issue after presenting the lower bound. 

3 Upper Bound for Deterministic Algorithms 

Next we prove the upper bounds on deterministic strategies in Theorem 1:

Lemma 2 For any lazy deterministic marking algorithm DMark and D ∈ Δ_ε,

\[ E_D[\mathrm{DMark}(r)] \le 2\Phi(\epsilon, k) \cdot E_D[\mathrm{Opt}(r)] + O(1). \]

Proof: Without loss of generality, assume that D generates only sequences 
whose last phase has k non-redundant requests. (Otherwise we can easily modify 
the distribution so that the condition is satisfied, while increasing E[OPT(r)] by 
at most the constant k.) In the context of the random sequence r, define the 
following random variables and events. 

R_{ℓ,i}: the (i + 1)st non-redundant request in the ℓth phase of r, if there
is an ℓth phase.

prefix(R): the prefix of r up to but not including request R of r.

new_bef(R): the number of new requests before request R in the phase of r
containing R.

new_in_ph(ℓ): the total number of new requests made in the ℓth phase of r,
if there is an ℓth phase, otherwise 0.

worrisome(R): the event that request R of r is worrisome.

In what follows, we abuse notation slightly as follows. By the event
"prefix(R_{ℓ,i}) = s", we mean "there is an ℓth phase in r and the prefix of
r preceding request R_{ℓ,i} is sequence s". Similarly, by the event
"worrisome(R_{ℓ,i})", we mean "there is an ℓth phase in r and the request
R_{ℓ,i} in that phase is worrisome".
We start by proving the following claim:
Claim 1 Fix any ℓ and i (1 ≤ i ≤ k − 1). Let s be any sequence such that the
event prefix(R_{ℓ,i}) = s can happen. That is, s has ℓ phases, and the last
phase of s has i non-redundant requests. Then

\[
\Pr[\mathrm{worrisome}(R_{\ell,i}) \mid \mathrm{prefix}(R_{\ell,i}) = s]
\;\le\; E\Bigl[ \frac{\mathrm{new\_bef}(R_{\ell,i})}{\max\{1,\ \epsilon^{-1} - i\}} \Bigm| \mathrm{prefix}(R_{\ell,i}) = s \Bigr].
\]

Conditioning on "prefix(R_{ℓ,i}) = s" lets us use the restrictions on the
adversary.

Here is the proof of Claim 1. In the event that s is a prefix of r, consider
the random variable r_t, where t = |s| + 1. (There must be such a request
because i < k and each phase of r, including the last, by the assumption at
the beginning of the proof, has k non-redundant requests.)

The event prefix(R_{ℓ,i}) = s happens if and only if s is a prefix of r and
r_t is non-redundant. If prefix(R_{ℓ,i}) = s, then the event
worrisome(R_{ℓ,i}) happens if and only if r_t is worrisome. Thus,

\[
\begin{aligned}
\Pr(\mathrm{worrisome}(R_{\ell,i}) \mid \mathrm{prefix}(R_{\ell,i}) = s)
&= \Pr(\mathrm{worrisome}(r_t) \mid s \text{ is a prefix of } r \text{ and } r_t \text{ is non-redundant}) \\
&= \frac{\Pr(\mathrm{worrisome}(r_t) \mid s \text{ is a prefix of } r)}{\Pr(r_t \text{ is non-redundant} \mid s \text{ is a prefix of } r)}.
\end{aligned}
\]

The last step uses the fact that, for a marking algorithm, a worrisome
request is necessarily non-redundant.

Assume that s is a prefix of r. After processing s, DMark has all but
new_bef(r_t) of the items requested in the previous phase in the cache.
Thus, the adversary can assign at most ε · new_bef(r_t) probability to
worrisome items. Thus, the numerator above is at most ε · new_bef(r_t).
Since there have been i non-redundant requests in this phase before r_t,
there are only i redundant items, so the denominator above is at least
1 − εi. To finish the proof of Claim 1, note that
E[new_bef(R_{ℓ,i}) | prefix(R_{ℓ,i}) = s] = new_bef(r_t).

Now fix i and ℓ. In the set of events {prefix(R_{ℓ,i}) = s | s is a
sequence}, exactly one event happens. Thus, the bound in Claim 1 holds
unconditionally:

\[
\Pr[\mathrm{worrisome}(R_{\ell,i})] \;\le\; E\Bigl[ \frac{\mathrm{new\_bef}(R_{\ell,i})}{\max\{1,\ \epsilon^{-1} - i\}} \Bigr].
\]

Since new_bef(R_{ℓ,i}) ≤ new_in_ph(ℓ), it follows that for all ℓ and i,

\[
\Pr[\mathrm{worrisome}(R_{\ell,i})] \;\le\; E\Bigl[ \frac{\mathrm{new\_in\_ph}(\ell)}{\max\{1,\ \epsilon^{-1} - i\}} \Bigr].
\]

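Since DMark(r) is the number of new or worrisome requests in r,

\[
\begin{aligned}
E[\mathrm{DMark}(r)]
&\le E\Bigl[\sum_{\ell} \mathrm{new\_in\_ph}(\ell)\Bigr] + \sum_{\ell} \sum_{i=1}^{k-1} E\Bigl[ \frac{\mathrm{new\_in\_ph}(\ell)}{\max\{1,\ \epsilon^{-1} - i\}} \Bigr] && (1) \\
&= \Bigl(1 + \sum_{i=1}^{k-1} \max\{1,\ \epsilon^{-1} - i\}^{-1}\Bigr) \cdot E\Bigl[\sum_{\ell} \mathrm{new\_in\_ph}(\ell)\Bigr] && (2) \\
&= \Phi(\epsilon, k)\, E[\mathrm{new}(r)] && (3) \\
&\le \Phi(\epsilon, k)\, E[2\,\mathrm{Opt}(r)] \quad \text{(by Lemma 1).} && (4)
\end{aligned}
\]

The last step uses new(r) ≤ 2 Opt(r), which gives the factor 2Φ(ε, k)
claimed in the lemma. □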
4 Lower Bound for Deterministic Algorithms

Next we prove the lower bounds on deterministic strategies in Theorem 1:

Lemma 3 For any ε > 0, any k, and any deterministic on-line algorithm A,
there is a distribution D ∈ Δ_ε such that

\[ E_D[A(r)] \ge (\Phi(\epsilon, k) - 1 + 1/m) \cdot E_D[\mathrm{Opt}(r)], \]

where m = max{1, ⌈ε^{-1}⌉ − k}, and E_D[Opt(r)] is arbitrarily large.

Proof: We describe D by describing an adversary that requests items
probabilistically subject to the limitations of Δ_ε. Fix ε > 0 and k > 0.
Assume ε > 1/(2k) (otherwise the desired lower bound is trivially satisfied,
because Φ(1/(2k), k) − 1 + 1/m is less than 1).

The adversary requests the items in an on-line fashion, phase by phase. In 
the first part of each phase, the adversary makes m new requests by assigning 
probability only to items not previously requested. 

For each remaining request, the adversary assigns a probability to each item 
as follows. First priority is given to worrisome items (those previously requested 
in this phase or in the last one but not in the cache of A). Second priority is 
given to redundant items (those requested previously in this phase and in the 
cache). Third priority is given to the remaining old items (the items not yet 
requested this phase, but in the cache). 

Items are selected in order of priority and assigned as much probability as
possible, subject to the constraint that no item is assigned probability
more than ε and the total probability assigned is 1. By the choice of m, we
have (k + m)ε ≥ 1, so all three kinds of items suffice for all probability
to be assigned.

The adversary follows this strategy until k distinct items have been
requested, at which point the adversary begins a new phase. The adversary
continues for N phases, where N is arbitrarily large so that Opt(r) is also
arbitrarily large.
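The priority rule for a single request amounts to greedy filling with a cap
of ε per item. The following sketch illustrates that step only; the function
name and argument layout are assumptions made here, and the three lists are
taken as given rather than derived from a simulated cache.

```python
from fractions import Fraction

def assign_probabilities(worrisome, redundant, old_cached, eps):
    """Assign each item at most eps, filling worrisome, then redundant, then old cached items."""
    cap = Fraction(eps)
    remaining = Fraction(1)
    probs = {}
    for group in (worrisome, redundant, old_cached):   # priority order from the text
        for item in group:
            share = min(cap, remaining)                # as much as the cap allows
            if share > 0:
                probs[item] = share
            remaining -= share
    if remaining > 0:
        raise ValueError("not enough items to place all probability")
    return probs
```

With ε = 2/5, one worrisome item and two redundant items, the worrisome item
and the first redundant item each receive 2/5, the second redundant item
receives the remaining 1/5, and old cached items receive nothing.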

This defines the distribution D ∈ Δ_ε. Let r be a random request sequence
chosen from D. Next we prove that E[A(r)] ≥ N m (Φ(ε, k) − 1 + 1/m). This
proves the claimed bound, since Opt(r) ≤ N m (by Lemma 1). Consider any ℓ
s.t. 1 ≤ ℓ ≤ N. For i = m, ..., k − 1, define

worrisome(R_{ℓ,i}): the event that the (i + 1)st non-redundant request of
the ℓth phase is worrisome.

The expectation of A(r) is N m + Σ_{ℓ,i} Pr[worrisome(R_{ℓ,i})]. For any ℓ
and i s.t. m ≤ i ≤ k − 1, consider the time just before the (i + 1)st
non-redundant request of the ℓth phase. There have been i non-redundant
requests so far in the phase, so there are i redundant items. There have
been m new requests so far, so there are k + m items that were requested
last phase or already this phase. Since the on-line algorithm has at least m
of these items not in the cache, there are at least m worrisome items. Thus,
the adversary assigns at least εm probability to worrisome items and at
least εi probability to redundant items. (Unless εm + εi > 1, in which case
Pr[worrisome(R_{ℓ,i})] = 1: the adversary forces a worrisome request.) Thus,
the probability that the request is worrisome, conditioned on it being
non-redundant, is

\[
\Pr[\mathrm{worrisome}(R_{\ell,i})] \;\ge\; \frac{\epsilon m}{\max\{1 - \epsilon i,\ \epsilon m\}}
\;=\; \frac{m}{\max\{\epsilon^{-1} - i,\ m\}}
\;=\; \frac{m}{\max\{\epsilon^{-1} - i,\ 1\}}.
\]

The rightmost equality holds because the choice of m implies that either
m = 1 or ε^{-1} − i ≥ m. Adding the m new requests and summing over
i = m, ..., k − 1, the expected cost to A for each of the N phases is at
least

\[
m + \sum_{i=m}^{k-1} \frac{m}{\max\{\epsilon^{-1} - i,\ 1\}}
\;\ge\; 1 + \sum_{i=1}^{k-1} \frac{m}{\max\{\epsilon^{-1} - i,\ 1\}}.
\]

The rightmost expression is m (Φ(ε, k) − 1 + 1/m). □

The adversary can probably be made a little stronger to get a slightly
better lower bound when ε < 1/(k + 1). In this case the issue of how the
optimal adversary should fix m appears to be relatively subtle. This is why
the lower bound loses the additive 1 with respect to the upper bound in this
case. One small improvement to the above adversary would be, when the
adversary is requesting new items, to use the opportunity to also allocate
probability to worrisome items.



5 Randomized Strategies 

In this section we finish the proof of Theorem 1 by proving the upper and
lower bounds for randomized strategies claimed there. By using what we
already know, very little work is required to get the bounds.

We first consider lower bounds. Fix ε > 0 and k > 0. We start with the case
ε ≤ 1/(k + 1). For simplicity we make the technical assumption that ε^{-1}
is an integer. This assumption is not too restrictive and allows us to reuse
the deterministic lower bound as follows.

Lemma 4 If ε^{-1} is an integer greater than k, then the distribution D
described in the proof of Lemma 3 is independent of the algorithm A.

Proof: Consider that distribution. Within each phase, the random sequence r
has requests to m new items, followed by requests restricted to a set of
k + m items, where m = max{1, ε^{-1} − k}, until k distinct items have been
requested. The condition on ε and the choice of m imply that m = ε^{-1} − k,
so that ε = 1/(k + m). In this case, each phase simply consists of requests
to m new items, followed by a sequence of requests to the k + m items, where
each request is chosen uniformly at random from those k + m items, until a
total of k distinct items have been requested, after which the next phase
begins. □
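One phase of this algorithm-independent distribution can be sampled as
follows. The sketch is illustrative: the function name, the `fresh` iterator
that supplies never-before-requested items, and the explicit random
generator are assumptions made here, and it presumes m ≤ k (which holds
under the proof's assumption ε > 1/(2k)).

```python
import random

def sample_phase(prev_items, k, m, fresh, rng):
    """Sample one phase: m new items, then uniform requests over k + m items."""
    new_items = [next(fresh) for _ in range(m)]
    requests = list(new_items)               # the m new requests open the phase
    pool = list(prev_items) + new_items      # the k + m items eligible afterwards
    distinct = set(new_items)
    while len(distinct) < k:
        x = rng.choice(pool)                 # each item has probability 1/(k+m) = eps
        requests.append(x)
        distinct.add(x)
    return requests, distinct
```

The returned set of k distinct items plays the role of `prev_items` for the
next phase, so repeated calls generate the whole sequence without consulting
any on-line algorithm.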

This distribution generalizes a distribution defined in a previous lower
bound on the competitive ratio of randomized on-line strategies against the
standard adversary [?, Thm. 8.7], [13, Thm. 13.2]. (That lower bound is
equivalent to our case m = 1.) There and here, Yao's principle implies that
for a random input r from any input distribution D, any randomized on-line
algorithm A_R satisfies

\[ E[A_R(r)] \ge \inf_{A} E[A(r)], \]

where A ranges over all deterministic on-line algorithms.

(Briefly, this is because A_R may be viewed as probabilistically picking
some deterministic algorithm A, and then running A on the input r. Thus,
E_D[A_R(r)] = Σ_A Pr[A_R chooses A] · E_D[A(r)] ≥ inf_A E_D[A(r)]. Here D
can be any distribution, but we take it to be the one defined in Lemma 3.
The input r is randomly chosen from D. We refer the reader to
[13, Thm. 13.2] or [?, Thm. 8.7] for a full explanation of Yao's principle
in this context.)

By Lemma 4, in the special case when ε^{-1} is an integer greater than k,
the distribution defined in the previous section is independent of the
on-line algorithm A. Thus, by Yao's principle, the lower bounds proved there
extend to randomized algorithms. This proves:

Lemma 5 Suppose ε ≤ 1/(k + 1) and ε^{-1} is an integer. Then the lower bound
established in Lemma 3 also applies to randomized on-line algorithms.

Decreasing ε only weakens the adversary. Thus, when ε^{-1} is not an
integer, letting ε' = 1/⌈ε^{-1}⌉ ≤ ε, the lower bounds hold with ε'
replacing ε.

Also, when ε = 1/(k + 1) it is easy to verify that the above lemma implies
that the ratios are at least H(k) = Σ_{i=1}^{k} 1/i. This proves:

Lemma 6 Suppose ε ≥ 1/(k + 1). Then R_R(Δ_ε) ≥ H(k).

So the above two lemmas prove the lower bounds for randomized strategies
claimed in Theorem 1. What about the upper bounds? Because the diffuse
adversary is no stronger than the standard adversary, we get immediately
from previous results that:

Lemma 7 For ε ≤ 1/(k + 1), R_R(Δ_ε, RMark) ≤ 2Φ(ε, k).
For ε ≥ 1/(k + 1), R_R(Δ_ε, RMark) ≤ 2H(k) − 1, while
R_R(Δ_ε, Partition) ≤ H(k), and R_R(Δ_ε, Equitable) ≤ H(k).

The first upper bound follows from the fact that Lemma 2 also applies to
RMark (since the upper bound applies to any deterministic marking algorithm,
i.e., any conditioning of RMark on a particular outcome of its random
choices). The remaining upper bounds follow from known upper bounds on the
competitive ratios of the various algorithms against the (stronger) standard
adversary [?]. Lemma 7 proves the upper bounds on randomized strategies in
Theorem 1. This completes the proof of that theorem.
6 Alternate Proof that LRU is Optimal

For the record, we include here a "distillation" of Koutsoupias and
Papadimitriou's proof that Lru is optimal against the diffuse adversary Δ_ε.
This version of the proof is shorter and self-contained, but does not give
the intermediate results about work functions in the original proof.

Given a request sequence s of items from a universe U, and an (arbitrary)
initial ordering π of the items, define the rank of an item x ∈ U in s to be
the rank of x in the following ordering: items that are requested in s are
first, in order of last request (most recent first); items that are not
requested in s are next, ordered by π.

In analyzing an on-line paging algorithm, if s is the sequence of requests 
seen so far, then the most recently requested item currently has rank 1, the 
next most recently requested item currently has rank 2, etc. Without loss of 
generality, when specifying a request or the contents of the cache, we can specify 
each item by its current rank; this uniquely identifies the item. Except in the 
proof of Lemma 8, where we use both representations, items in this section are
assumed to be specified by their current rank. 
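The rank representation can be made concrete with a short sketch. The
function names are illustrative, and one convention is assumed here: a
request is encoded by the rank its item had just before the request, i.e.
with respect to the prefix of earlier requests.

```python
def ranks(requests, initial_order):
    """Map each item to its rank: requested items by recency, then the rest by initial order."""
    ordering = []
    for x in reversed(requests):             # most recent request first
        if x not in ordering:
            ordering.append(x)
    ordering += [x for x in initial_order if x not in ordering]
    return {x: i + 1 for i, x in enumerate(ordering)}

def to_ranks(requests, initial_order):
    """Rewrite each request as the rank its item had just before being requested."""
    return [ranks(requests[:t], initial_order)[x] for t, x in enumerate(requests)]
```

With initial ordering a, b, c, d, the sequence c, a, c, d is encoded as
3, 2, 2, 4: item c starts at rank 3, a is then at rank 2 behind c, c drops
to rank 2 once a has been requested, and d has stayed at rank 4 throughout.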

Lemma 8 Let r and r' be two equal-length request sequences. Let r̄ and r̄',
respectively, be the same sequences but with each request specified by rank
(w.r.t. the same initial ordering and universe). If r̄ dominates r̄' in the
sense that r̄_t ≥ r̄'_t for all t, then Opt(r) ≥ Opt(r').

Proof: It suffices to prove the case when there is a single d such that
r̄'_d = r̄_d − 1 but r̄_t = r̄'_t for all t ≠ d. The general case then
follows by induction. Assume such a d.

How do r and r' differ? Consider the two sequences simultaneously for
t = 1, 2, ..., |r| in an on-line fashion. At each t focus on the ranks of
the items in the two subsequences s = r_1 r_2 ... r_t and
s' = r'_1 r'_2 ... r'_t.

At each time t < d, for each item, the rank in s equals the rank in s'. Let
x and x' be the items requested, respectively, in r and r' at time d. By
assumption, just before time d, the respective ranks of x and x' are r̄_d
and r̄_d − 1. What about just after time d? In sequence s, the rank of x
changes to 1, while the rank of x' changes to r̄_d. In sequence s', the rank
of x stays r̄_d, while the rank of x' changes to 1. For each item other than
x or x', the rank of the item is equal in both sequences.

This means that the sequence of items requested by r is the same as the
sequence of items requested by r', except that from time d to the end, the
roles of x and x' are reversed: if r requests x (resp. x'), then r' requests
x' (resp. x).

Let i and i', respectively, be the times of the most recent requests to x
and x' before time d. (If either item is being requested for the first time,
then let i or i' equal 1, as appropriate.) By assumption r̄'_d = r̄_d − 1,
so i < i'.

Consider any schedule S for r. For any j with i' < j ≤ d, consider obtaining
S' from S by reversing the roles of x and x' from time j onward (i.e.
swapping the two in S_j, S_{j+1}, ...). By the established relation between
r and r', S' will be a valid schedule for r'. To finish, we need only choose
j so that S' costs no more than S. In particular, at time j, S' should evict
no more of the two pages {x, x'} than S does. If for some j,
|{x', x} ∩ S_j| ∈ {0, 2} or |{x', x} ∩ S_{j-1}| ∈ {0, 2}, then this j
clearly suffices. Otherwise there is a j such that {x', x} ∩ S_{j-1} = {x'}
and {x', x} ∩ S_j = {x}. Using this j, S' is cheaper than S. □

Theorem 2 ([?]) Let D be any distribution in Δ_ε. Let A be any deterministic
on-line algorithm. Then there is a distribution D' ∈ Δ_ε such that

\[ E_D[\mathrm{Lru}(r)] \le E_{D'}[A(r')] \quad\text{and}\quad E_D[\mathrm{Opt}(r)] \ge E_{D'}[\mathrm{Opt}(r')], \]

where r and r' are randomly chosen according to D and D', respectively.
Thus, R(Δ_ε) = R(Δ_ε, Lru).

Proof: In what follows, we assume all items are specified not by name but by 
rank (with respect to some sequence implicit in context, the universe U of the 
items requested by D, and an arbitrary initial ordering). 

Intuitively, the argument is the following. At each request, we pair each
page x in A's cache but not in Lru's cache with a unique page f(x) in Lru's
cache but not in A's. For each such x, if D assigns more probability to x than
to f(x), then we shift some of the probability from x to f(x). This gives us a
modified assignment of probabilities to pages for the request; in this way we
define D'. We show that this shifting procedure ensures that at each request,
A is at least as likely to fault (on a request from D') as Lru was (on the corresponding
request from D). Furthermore, D' is better for Opt than D is, because when
we shift probability from x to f(x), we know that, as x is not in Lru's cache but
f(x) is, we are shifting probability from a higher-ranked page to a lower-ranked
page (in the sense of the lemma above).

Formally, the following random experiment defines the distribution D' by 
describing how to choose a random sequence r' according to that distribution. 
Choose a random sequence r according to D. Reveal r in an on-line fashion, one 
request at a time, producing each corresponding request of r' as follows. 

Let L denote the cache of Lru (specified by rank with respect to s) after
processing $s = r_1 \ldots r_{t-1}$. Similarly, let A denote the cache (specified by rank
with respect to s') of A after processing $s' = r'_1 \ldots r'_{t-1}$. Let f be any 1-1 mapping
from A - L into L - A (note $|A| \le |L|$) and define (in the context of s and s')

$$X = \{x \in A - L \mid p(f(x)) < p(x)\}, \quad\text{where}\quad p(x) = \Pr_D(x \mid s).$$

X is the set of pages from which we want to shift probability. 

Finally, determine $r'_t$ as follows. First set $r'_t = r_t$, but if $r_t \in X$, change $r'_t$ to
$f(r_t)$ with probability $1 - p(f(r_t))/p(r_t)$.

This completes the random experiment that gives r' and so defines D' . Each 
outcome of this experiment determines a pair of random variables (r, r'). 

We use "$\Pr_{D'}(E \mid s, s')$" to denote the probability of event E conditioned
on s and s' being prefixes of r and r'. The following claim characterizes the
distribution of $r'_t$ conditioned on this event.
Claim 2 Fix any two sequences s and s' with length t - 1. Define
$p'(x) = \Pr_{D'}(r'_t = x \mid s, s')$. Then for each x, $p'(x) = p(\pi(x))$, where $\pi$ is the permutation
that swaps x with f(x) for each $x \in X$ and fixes every other item: $\pi(x) = f(x)$ and
$\pi(f(x)) = x$ if $x \in X$, and $\pi(y) = y$ for all remaining y.

The claim follows by direct calculation based on the last line of the experiment. 
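As a concrete instance of that calculation, the sketch below carries out one step of the experiment with exact rational arithmetic. It assumes the shift probability $1 - p(f(x))/p(x)$ for $x \in X$ (the value for which the claim holds); the probabilities, the caches and the name `shifted_distribution` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def shifted_distribution(p, A, L, f):
    """Distribution p' of r'_t, given the distribution p of r_t.

    p    : dict mapping each page to its probability under D (given s)
    A, L : caches of the on-line algorithm and of LRU, as sets of pages
    f    : dict, a 1-1 map from A - L into L - A
    Assumes a request to x in X is redirected to f(x) with
    probability 1 - p(f(x))/p(x).
    """
    X = {x for x in A - L if p[f[x]] < p[x]}
    p_new = dict(p)
    for x in X:
        q = 1 - p[f[x]] / p[x]      # probability of redirecting x to f(x)
        p_new[x] -= p[x] * q
        p_new[f[x]] += p[x] * q
    return X, p_new


# Hypothetical numbers: five pages named by rank, cache size k = 2.
p = {1: Fraction(1, 10), 2: Fraction(1, 5), 3: Fraction(3, 10),
     4: Fraction(1, 4), 5: Fraction(3, 20)}
L = {1, 2}                          # LRU holds the pages of rank 1..k
A = {3, 5}                          # cache of some other algorithm
f = {3: 1, 5: 2}

X, p_new = shifted_distribution(p, A, L, f)

# The permutation pi of the claim swaps x and f(x) for x in X.
pi = {y: y for y in p}
for x in X:
    pi[x], pi[f[x]] = f[x], x

assert all(p_new[y] == p[pi[y]] for y in p)
```

In this instance only page 3 lies in X, and the probabilities of pages 3 and 1 are exchanged. Since p' is a rearrangement of p, its largest value is unchanged.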

Claim 3 Let r, r', s, and s' be as in Claim 2. Then

$$\Pr[\mathrm{LRU} \text{ faults on } r_t \mid s, s'] \le \Pr[A \text{ faults on } r'_t \mid s, s'].$$

Why? It suffices to show that for every item x in A, there is a unique item y in
L such that $p'(x) \le p(y)$. But by Claim 2 and the choice of X, this is the case:
take y = x unless $x \in A - L$, in which case take $y = f(x) \in L - A$.
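Spelled out, the pairing gives a comparison of hit probabilities. In the display below, y(x) denotes the item of L paired with x as just described; this notation is introduced here only for illustration.

```latex
\Pr[r'_t \in A \mid s, s']
  = \sum_{x \in A} p'(x)
  \le \sum_{x \in A} p(y(x))
  \le \sum_{y \in L} p(y)
  = \Pr[r_t \in L \mid s, s'].
```

The second inequality uses that the map from x to y(x) is one-to-one from A into L. Taking complements on both sides turns this into the inequality between fault probabilities.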

Note that in the claim above equality does not necessarily hold because A may
not have k pages in its cache, or it may have "irrelevant" pages in its cache: a
page x that D requests with less probability than the corresponding page f(x)
in Lru's cache (so no probability is shifted from x to f(x)).

Claim 4 The first part of the theorem is true: $E_D[\mathrm{LRU}(r)] \le E_{D'}[A(r')]$.

This follows directly from Claim 3. To see it formally, letting s and s' range
over all equal-length pairs of sequences, we have

$$\begin{aligned}
E_D[\mathrm{LRU}(r)] &= \sum_{s,s'} \Pr_{D'}(s,s') \, \Pr[\mathrm{LRU} \text{ faults on } r_{|s|+1} \mid s, s'] \\
&\le \sum_{s,s'} \Pr_{D'}(s,s') \, \Pr[A \text{ faults on } r'_{|s'|+1} \mid s, s'] \\
&= E_{D'}[A(r')].
\end{aligned}$$

Above $\Pr_{D'}(s,s')$ denotes the probability that s is a prefix of r and s' is a prefix
of r' in the random experiment.

Claim 5 The second part of the theorem is true: $E_D[\mathrm{OPT}(r)] \ge E_{D'}[\mathrm{OPT}(r')]$.

Since the random experiment described above produces the same distribution
on r as D does, it suffices to prove the inequality assuming that the pair (r, r')
is generated by that experiment. Since Lru keeps the most recently requested
items in its cache, and $f : (A - L) \to (L - A)$, we have $f(x) < x$ (that is, f(x) has the
smaller rank). Thus, in any outcome, r dominates r' (in the sense of the lemma above)
and so $\mathrm{OPT}(r) \ge \mathrm{OPT}(r')$.
This proves the claim.

Claim 6 The distribution D' defined by the random experiment is in $\Delta_\epsilon$.

This also follows directly from Claim 2. To prove it in detail, we need to show
that for any s' and x, $\Pr_{D'}(x \mid s') \le \epsilon$. But

$$\Pr_{D'}(x \mid s') = \sum_{s} \Pr_D(s) \, \Pr_{D'}(r'_t = x \mid s, s') \le \sum_{s} \Pr_D(s) \, \epsilon = \epsilon.$$

Above $\Pr_D(s)$ denotes the probability that s is a prefix of r, and s ranges over all
sequences of length |s'| = t - 1. The inequality follows because by
Claim 2 each $\Pr_{D'}(r'_t = x \mid s, s')$ equals $\Pr_D(y \mid s)$ for some y, and by the assumption
that $D \in \Delta_\epsilon$, $\Pr_D(y \mid s) \le \epsilon$. This proves the claim (and the theorem!). □